Gene activity fully predicts transcriptional bursting dynamics

Transcription commonly occurs in bursts, with alternating productive (ON) and quiescent (OFF) periods, governing mRNA production rates. Yet, how transcription is regulated through bursting dynamics remains unresolved. Here, we conduct real-time measurements of endogenous transcriptional bursting with single-mRNA sensitivity. Leveraging the diverse transcriptional activities in early fly embryos, we uncover stringent relationships between bursting parameters. Specifically, we find that the durations of ON and OFF periods are linked. Regardless of the developmental stage or body-axis position, gene activity levels predict individual alleles' average ON and OFF periods. Lowly transcribing alleles predominantly modulate OFF periods (burst frequency), while highly transcribing alleles primarily tune ON periods (burst size). These relationships persist even under perturbations of cis-regulatory elements or trans-factors and account for bursting dynamics measured in other species. Our results suggest a novel mechanistic constraint governing bursting dynamics rather than a modular control of distinct parameters by distinct regulatory processes.


INTRODUCTION
Eukaryotic transcriptional regulation is an inherently dynamic and stochastic process. Multiple molecular events orchestrate, in space and time, the initiation of productive transcription by individual RNA polymerases (Pol II complexes), leading to the synthesis of nascent RNA [1,2]. The amount of transcribed mRNA molecules in turn shapes protein production and thereby dictates cellular behavior. Studies across various systems, from yeast to mammalian cells, have revealed that transcription occurs in bursts, namely the release of multiple Pol IIs in what is often referred to as an ON period, followed by a quiescent OFF period [3][4][5][6][7][8]. Yet it remains unclear how the kinetic parameters of transcriptional bursting determine mRNA production and govern spatiotemporal transcription dynamics. Is the transcription rate controlled primarily by tuning the durations of ON or OFF periods, the initiation rate (the rate of Pol II release during active periods), or by a combination of these? Furthermore, are distinct bursting parameters controlled by specific regulatory processes, and are distinct bursting strategies underlying temporal versus spatial (tissuespecific) control of transcription?
As a multitude of molecular processes is known to influence transcriptional activity, several studies aimed to uncover links between regulatory determinants and parameters of transcriptional bursting [9][10][11]. Transcriptional bursting is commonly described by parameters such as burst size, burst frequency, the kinetic rates governing ON and OFF times as well as transcription initiation * These authors contributed equally.
† Correspondence: tg2@princeton.edu [3,12,13]. Regulatory determinants, such as transcription factor (TF) binding, cis-regulatory elements, nucleosome occupancy, histone modification, and enhancerpromoter interactions were suggested to affect distinct bursting parameters [14][15][16][17][18][19][20][21][22]. Yet, it is difficult to integrate these observations and form a unified understanding of transcriptional control via bursting dynamics. Much of our quantitative knowledge about transcriptional bursting heavily relies on fixed data [3,14,[23][24][25][26]. Capturing transcriptional bursts in vivo across space and time remains challenging [20,[27][28][29]. Adding to this challenge, live measurements need to be quantifiable in absolute units (i.e., mRNA count) to facilitate comparisons between different genes and conditions [30][31][32]. Moreover, to understand the entire spectrum of bursting dynamics, there is a need to probe the full dynamic range of a gene's activity [25]. Measurements in an endogenous system where tightly regulated spatiotemporal transcriptional dynamics dictate cell-fate determination can further elucidate the functional consequences of bursts and their relation to different regulatory determinants. The early Drosophila embryo provides a unique system that meets all these requirements [33].
Here we quantify the endogenous transcription dynamics of key developmental genes in living Drosophila embryos. We identify a single control parameter, the instantaneous ON-probability of an allele as the dominant determinant of transcriptional activity, while the initiation rate is mostly conserved. This finding holds across spatial domains, developmental times, genes, and perturbations of cis-regulatory elements and trans-regulators. Surprisingly, we find a largely constant ON-OFF switching correlation time of roughly one minute. A corollary of the latter is that mean ON and OFF times are tightly coupled, regardless of the trans-environment or the cisregulatory architecture. While perturbations in the up-stream regulatory processes lead to dramatic changes in the ON-probability, i.e., spatiotemporal changes in transcription rate, the underlying changes in ON and OFF periods are predicted from wild-type. Instead of a particular perturbation type dictating changes in specific bursting parameters, we observe that more generally lowly transcribing alleles are tuned to higher expression levels by increasing burst frequency, while highly transcribing alleles are mostly tuned by increasing burst size.These results imply that for the examined genes, the bursting phenomenon can be quantitatively understood by a few simple rules: two of the transcription parameters are quasi-constant, and all others are determined by the ONprobability. Hence, future investigations necessitate a re-examination of our mechanistic understanding of transcription, focusing on how regulatory processes influence a unique control parameter.

RESULTS
Instantaneous single allele transcription rate measurements. To study the principles that govern transcription dynamics across space and time, we need access to the endogenous bursting kinetics at a single allele level. We designed an approach to obtain such quantitative measurements in living Drosophila embryos [31,35]. A versatile CRISPR-based scheme is employed to incorporate MS2 cassettes into the gap genes' introns (or 3'UTR) [36]. These form stem-loops in the transcribed nascent RNA that are subsequently bound by fluorescent coat-proteins (Fig. 1A, S1A and Methods) [37,38]. A custom-built two-photon microscope generates fluorescence images, capturing RNA synthesis at one tagged allele per nucleus with a 68-fold signal improvement over previous studies [31], approaching single-mRNA sensitivity ( Fig. S1D-E). An optimized field-of-view yields 10 s interval time-lapses for hundreds of nuclei per embryo, during nuclear cycles (NC) 13 and 14 ( Fig. 1A-B; Videos V1-V4), essential for statistical analysis.
To achieve a fully quantitative characterization, we calibrate our measurements to absolute units. We convert the fluorescence signal at the site of transcription into equivalent cytoplasmic mRNA units (C.U.) by matching the mean transcriptional activity to previously calibrated smFISH measurements (Fig. 1B, see Methods) [25,34]. We find high agreement between smFISH and live measurements, as a single conversion factor adjusts for the difference in fluorescence signal between the two methods, with an average relative error of ∼ 5% across all gap genes and MS2 insertion sites ( Fig. 1C and S1B). This agreement extends to higher moments (Fig. S1C), despite the fully orthogonal nature of these techniques: one being non-invasive genetically but involving fixation, while the other involves gene editing and stem-loop cassette insertions. This strongly suggests that our live approach captures the endogenous situation and provides means to express our dynamic transcription measurements in terms of absolute mRNA counts.
Our unique combination of absolute calibration and near single transcript sensitivity (Fig. S1D-E) allows us to reconstruct the underlying single allele transcription initiation events by individual Pol II, namely the events in which Pol II complexes are released onto the gene and engage in productive elongation. To infer these initiation events for each time series, we adopt a Baysian deconvolution approach that accounts for measurement noise ( Fig. S1D and Methods). The convolution kernel models the fluorescent signal resulting from the Pol II elongation process through the stem-loop cassette (with constant and deterministic elongation, Fig. 1D) [20,27]. For each time series, the approach generates multiple configurations of transcription initiation events (Fig. 1E). Averaging these configurations gives us a time-dependent instantaneous single allele transcription rate r(t) per time series (Fig. 1F).
We validate this kernel-based deconvolution approach by performing dual-color tagging of the gene body (a 5' proximal intron and a 3'UTR tag). These measurements support our key assumptions (see Methods) and allow us to extract a Pol II elongation rate of K elo = 1.8 ± 0.1 kb/min, which is in line with previous measurements [31,39] (Fig. S2). Our inferred transcription rates are thus no longer masked by the Pol II elongation dwell time, unlike the directly measured intensities of transcriptional activity. Transcription rates are thus independent of gene length, enabling the direct comparison between different genes and opening a path to identify common principles underlying transcription dynamics.
Single allele transcription rates hint at a universal bursting regime. Before analyzing the gap genes' transcription dynamics at the single allele level, we sought to determine whether the averages of the deconvolved single allele transcription rates r recapitulate the welldocumented average protein dynamics [40]. We compute a mean transcription rate R = r per gene along the anterior-posterior (AP) axis for all time points during NC13 and NC14 (Video V5); averaging occurs over 200-300 nuclei (each contributing one allele) in the same AP and time bin from 10-20 embryos ( Fig. 1A and 1B). The extracted mean transcription rate profiles ( Fig. 2A and S3A) strongly resemble the ones reconstructed from carefully staged gap gene antibody staining, including the well-documented posterior shifts during NC14. Indeed, with simple assumptions on diffusion and lifetime, the mean transcription rates R predict protein patterns with minimal post-transcriptional regulation (Fig. S4, Video V6). Thus, in this system, the rules governing transcription rate modulation will largely determine function, i.e., protein synthesis.
While the examined genes exhibit a common range of mean transcription rates R (Video V5), they display distinct spatiotemporal profiles, giving rise to gene-specific protein patterns. Yet when we examine the distributions previously calibrated fixed smFISH (black) measurements for all examined gap genes [25]. A global conversion factor (one for all genes) leads to a match within 5% error between live and fixed profiles (Fig. S1B), resulting in our activity unit, i.e., cytoplasmic unit (C.U.) equivalent to the intensity of a fully elongated transcript [34]. (D) Reconstruction of transcription initiation events from deconvolution of single allele transcription time series. The signal is modeled as a convolution between transcription initiation events and a kernel accounting for the elongation of a single Pol II through the MS2 cassette and the gene body (using an elongation rate K elo = 1.8 kb/min, Fig. S2). Bayesian deconvolution is performed by sampling from the posterior distribution of possible configuration of initiation events given the measured activity and measurement noise (Fig. S1D-E). (E) Example deconvolved initiation configuration (gray bars) and corresponding reconstructed signal (red) from a single allele transcription time series (black). (F) single allele transcription rate (gray) from same allele as in E (black). The rate is estimated by counting the number of initiation events within 10 s intervals for a given sampled configuration and averaged over 1'000 of such configurations. The displayed solid line and envelope for transcription rate (gray) and reconstructed signal (red) correspond to the mean and one standard deviation of the posterior distribution.
of single allele transcription rates, r, underlying a similar mean transcription rate, P (r|R), we find that these distributions collapse across genes (Fig. 2B). Strikingly, for low-to mid-levels of R, the underlying distributions differ starkly from a constitutive regime, which would result in a Poisson distribution. At these levels, the large amounts of non-transcribing or barely transcribing alleles hint at quiescent OFF periods deviating from a constitutive regime. As the mean transcriptional activity increases, the distributions become more Poissonian, sug-gesting that P ON , i.e., the probability of the genes being ON, increases. These observations are consistent with bursting behavior, where the gene alternates between ON and OFF states. The collapse of our data and the deviation from a constitutive, Poisson regime, are readily observed also when we compute the relationships between R and the higher moments of the P (r|R) distributions ( Fig. 2C  genes transition all the way from fully OFF (P ON = 0) to fully ON (P ON = 1). The gap genes thus provide an opportunity to investigate how an underlying bursting regime can account for a full dynamic range of transcriptional activities.
The dynamic nature of our measurement allows us to examine single allele transcription rates not only via distributions pooled across nuclei but also within individual transcription time series. We find the auto-correlation function of the single allele transcription rates provides further evidence for an underlying common bursting regime (Fig. 2D). An initial sharp drop of magnitude 1 − Σ AC at our sampling time scale (∼ 10 s) indicates the presence of uncorrelated noise, consistent with independent Pol II initiation events. This drop is followed by a longer decay of correlated noise at time scale τ AC . Such correlated noise is expected in a bursting regime, as the switching between ON and OFF states introduces temporal correlation in transcriptional activity. A theoretical and computational analysis using the two-state model of transcription [12] supports both the interpretation of the auto-correlation functions (Fig. S3D), and our ability to estimate Σ AC and τ AC properly from the deconvolved rates ( Fig. S3E-G).
We find both the magnitude of correlated noise Σ AC (Fig. 2E) and the correlation time τ AC (Fig. 2F) collapse across AP bins and different genes. Notably, the magnitude Σ AC is highly constrained and drops at high R, consistent with the behavior of the variance (Fig.  2C). Furthermore, the correlation time τ AC is largely conserved across nuclear cycles, across AP bins, and across genes, confined within the range of 1-2 min and averaging to a value of 1.37 ± 0.31 min (Fig. 2F). This surprising invariance of τ AC with respect to R suggests that a key temporal characteristic of transcription dynamics is highly conserved. Thus, both the static moment and correlation-based analyses point to a common bursting regime, applicable across genes and space, motivating a time-dependent analysis of transcriptional bursts at the level of individual time series.
Allele ON-probability is the key regulated transcriptional parameter. To directly quantify individual transcriptional bursts at the single allele level, we take advantage of our deconvolved initiation events and instantaneous transcription rate time series that are unencumbered by the signal-blurring effect of elongation (Fig. 3A). They allow us to identify distinct periods of active transcription, characterized by consecutive initiation events (i.e., multiple Pol IIs released into productive elongation), interpreted as ON periods, followed by quiescent periods, namely OFF periods (Fig. 3A). We define the switch of an allele from an OFF to an ON state when a moving average of the single allele transcription rate exceeds 2 mRNA/min (see Methods). This threshold is consistent with our detection sensitivity of 1-2 mRNAs, and the size of the moving window for averaging is set based on the correlation time scale from the auto-correlation analysis. The main strength of our burst calling routine is its sole reliance on a minimal clustering model, and as such being devoid of any mechanistic assumptions on the underlying bursts (no explicit mechanistic model is needed, see Methods).
Given a computed bursting profile (demarcated ON and OFF periods) for every single allele, we can now ask how bursting dynamics underlie transcription rates. Specifically, the mean transcription rate at time t, R(t), can be decomposed into two parameters: the instantaneous probability of an allele being in the ON state P ON (t) (i.e., the fraction of ON alleles) and the mean initiation rate in the ON state K(t). Starting with the gene hb, we thus estimate, for a given AP bin, the timedependent parameters R(t) and P ON (t) by averaging all (∼ 250) single allele instantaneous transcription rates and counting the fraction of alleles in the ON state at time t, respectively ( Fig. 3B-C). To compute K(t), we average initiation events restricted to the ON state (as opposed to R, which is averaging initiation events regardless of allele state). We repeat this procedure for each position of the AP-axis to obtain the full spatiotemporal dependence ( Fig. 3D-F). We validate our approach for burst calling and the recovery of bursting parameters from transcription time series on simulated data (based on a 2-state model) with an overall median error of 10% (see Methods, Fig. S5 and Fig. S6).
All three parameters vary significantly across both space and time ( Fig. 3D-F). However, given that these are related by R = K · P ON (Fig. S7A), it is possible that most of the variation in R stems from changes to either K, or P ON , or both. When R is plotted against P ON , all data points across time and space collapse in a tight monotonically increasing function (Fig. 3G), with P ON spanning from 0 to 1 (fully OFF to fully ON), echoing back to the noise analysis above ( Fig. 2B-C). Similarly, the initiation rates K tightly collapse across space and time when plotted against P ON . However, K only covers a two-fold change in dynamic range (Fig. 3H), which is marginal compared to R spanning from 0 to 15 (mRNA/min) ( Fig. 3I and S7B-C). This two-fold change is largely due to the existence of two optically unresolved sister chromatids and a modest time dependence of K throughout the nuclear cycle ( Fig. S7D-I, see Methods). With these considerations we estimate the mean Pol II spacing for a single active chromatid at 303 ± 73 bp, consistent with the classic Miller spreads with average Pol II spacing of 330 ± 180 bp [41].
Overall, we find that R is tightly controlled by P ON , while K is only moderately modulated and has significantly less predictive power over R. These results for hb suggest that transcriptional activity is mainly controlled through the probability of an allele being in the ON state. Once in the ON state, transcription initiates at a quasi-constant rate.   due to developmental regulation, transcriptional bursting in this system seems to operate in a near-steady-state regime.
While T ON and T OFF change over time and in different AP bins ( Fig. 4B-C), when we plot T ON and T OFF against P ON all data points collapse again across time and space onto two tight anti-symmetric relationships ( Fig.  4E-F). Various combinations of T ON and T OFF could potentially give rise to any given P ON , however, here we observe a highly restricted range for these mean durations. Hence a given P ON is unequivocally linked to a specific pair of T ON and T OFF , regardless of space and time.
An allele switching dynamically between ON and OFF states will have a correlation time T C , which determines, on average, the time needed for the single allele transcription rate to become uncorrelated. For such a system, T C can be computed directly from the mean ON and OFF times and is defined by 1/T C = 1/T ON + 1/T OFF (Fig.  4G, see Methods). Surprisingly, for all AP and time bins, T C is confined between 1 − 1.5 min and thus largely constant and independent of P ON . Moreover, T C matches quantitatively the correlation time τ AC from the autocorrelation analysis (Fig. 2F), which we found to be independent of the transcription rate. Given that T C characterizes allele switching correlations by construction, this match suggests that the nature of τ AC could indeed be related to bursting.
The fact that T C seems to remain conserved across space and time restricts the mean ON and OFF times. Indeed, T ON and T OFF can be expressed as a function of P ON and T C (via T ON = T C /1 − P ON and T OFF = T C /P ON , since P ON = T ON /(T ON +T OFF ) near the steady state, cf. Fig. 4D). Thus, the constancy of T C mathematically explains the tight anti-symmetric relationships of T ON and T OFF with respect to P ON (Fig. 4E-F), so that P ON not only governs the mean transcription rate R but also the entire transcriptional bursting dynamics.
Common bursting relationships underlie the regulation of all gap genes. Should we expect these bursting parameter relationships found for hb to generalize to other gap genes or are they gene-specific? The gap genes differ in their cis-regulatory elements, namely different numbers and arrangement of enhancers and promoters (Fig. S1A), and different compositions of transcription factors binding sites within each enhancer. Correspondingly, and as discussed above, the gap genes display distinct transcriptional activities along the body axis ( Fig.  1C) and across time (Video V8), which, in the case of hb we found to be largely governed by P ON .
When we apply our deconvolution and burst calling  Figure S8D). procedure ( Fig. 3A) to the measured single allele transcription time series of other gap genes (gt, Kr, and kni ), we observe that all genes differ substantially in their spatiotemporal P ON profiles (Fig. 5A). These differences are expected and reflective of the above-mentioned distinct underlying cis architectures and trans regulators of these genes. Strikingly, despite these differences, they show an almost identical mean transcriptional rate R to P ON relationship (and K to P ON relationship, Fig. S8C), substantiating P ON as the governing factor for the transcriptional activity not only across time and space but also across genes (Fig. 5B).
Computing the various bursting parameters for all gap genes and plotting these as a function of P ON , we find the genes display the same P ON -dependent relationships: all genes share common T OFF to P ON (Fig. 5C) and T ON to P ON (Fig. 5D) relationships. Thus, when different genes display a specific P ON value, possibly at different spatiotemporal coordinates, the underlying T ON and T OFF periods employed are nonetheless largely similar. This finding can be related to the switching correlation time T C , which we find to be conserved across genes (Fig.  5E). The average T C value across all genes, positions, and times is 1.25 ± 0.37 min, very close to the prediction from the single rate auto-correlation analysis (Fig. 2F).
Pooling all our data across all genes, times, locations, and embryos (N > 10 6 data points) and plotting each of the computed bursting parameters (R, K, T C , T OFF and T ON ) against P ON (Fig. 5F) -including the oftenused burst frequency F = 1/(T ON + T OFF ) and burst size B = K · T ON (Fig. S8E) -reveals highly constrained relationships. Indeed, all data points occupy only a very small subset of the parameter space (see methods). Tight functions can be mapped out (black lines) that confirm the exceptional predictive power conferred by P ON . Separating all the data into three developmental time windows (NC13 & early NC14, mid NC14, and late NC14), shows a further tightening of the relationships in these developmental stages (Fig. S9). This separation thus confirms that parts of the dispersion results from slow and moderate changes in K (and to a lesser extent T C ) over developmental time (at most 40% reduction of K and 25% for T C ).
All measured bursting dynamics seem to adhere to this simple set of rules, across genes, space, and time. The mean transcription rate R is essentially dictated by P ON , with a largely constant K. A near-constant switching correlation time T C leads to a specific functional relationship for T ON and T OFF of inverse proportionality. Thus, only one of these two parameters is principally modulated and the other remains quasi-constant. While lowly transcribing alleles are tuned to higher expression levels predominantly by decreasing T OFF , medium-to-high transcribing alleles are mostly tuned by increasing T ON . These simple rules contain all information necessary to govern bursting and consequently transcriptional activity in the system.
The common bursting relationships predict the effects of cis-and trans-perturbations. Diverse regulatory mechanisms have been implicated in the control of transcriptional activity, including cis-regulatory elements (e.g., enhancers) and trans-factors (e.g., TF repressors). It is often assumed that distinct regulatory mechanisms directly control distinct bursting parameters. Will the established bursting parameter relationships based on wild-type measurements predict bursting dynamics when we perturb regulatory mechanisms?
To address this question, we devised a strategy to perturb the endogenous system in cis and in trans. For cis-perturbations, we delete a distal hb enhancer from the hb locus that has an MS2-stem loop cassette in the first intron (Fig. 6A). This enhancer removal has a complex effect on hb activity, including increased and decreased transcriptional activity at different times and AP locations (Fig. 6B), consistent with previous observations [42][43][44]. Despite the stark deviation in the mutant spatiotemporal transcriptional activity compared to the wild-type, we find that mean transcription rates across space and time are governed by P ON . Thus, the predictive power of this parameter observed for the wild-type holds for the mutant as well (Fig. 6B). The mutant further adheres to the other bursting relationships identified in the wild-type. In particular, the restrictive T ON and T OFF to P ON relationships hold, as well as the largely conserved switching correlation time T C (Fig. 6C).
A second enhancer deletion, namely, the removal of the kni distal enhancer results in a significant reduction in kni activity. The mutant samples have a smaller dynamic range of activity, yet we find a similar data collapse within that range ( Fig. S10A-C). Finally, to examine a trans-perturbation, we measure kni activity in embryos with a hb null background. kni activity was substantially altered, consistent with earlier studies [45] (Fig.  S10D-E). Yet, again, we observe the collapse of the mutant bursting parameters onto the wild-type busting rules (Fig. 6D).
The consistency of these bursting rules suggests that the wild-type derived relationships (Fig. 5F) can predict how changes in T ON and T OFF account for the change in transcriptional activity upon the perturbation. Specifically, to examine the perturbation's effect on transcriptional activity at a given AP bin and time in development, we can consider the pair {P wt ON , P mut ON } at that spatiotemporal position (two examples of such pairs are marked in the kymographs in Fig. 6B). Using the dependencies of T ON and T OFF on P ON , we predict whether the change in transcriptional activity (i.e., the change from P wt ON to P mut ON ) stems predominantly from a change in T ON or in T OFF (Fig. 6E). For our two example pairs, one is predicted to be mostly governed by T OFF ("o" mark), while the other is mostly governed by TON (" " mark). This exercise can be generalized to all {P wt ON , P mut ON } pairs and subsequently verified using the measured T ON and T OFF values for wild-type and the perturbations ( (C-D) Transcriptional parameters for hb in cis-mutation (C, cyan) and for kni in trans-mutation (D, light green) collapse on corresponding wild-type parameters (gray). cis-mutation is the hb distal enhancer removal from A and the trans-mutation is a hb null background compared to wild-type ( Fig. S10D-E). Solid black lines correspond to the endogenous bursting rules from pairs correctly verify the type of modulation predicted by our rules (Fig. 6F). We reached the same conclusions predicting changes in burst size and burst frequency ( Fig. 6G and S11G-I). Importantly we find that each type of perturbation displays both predominant T ON or T OFF modulation at different times and positions. The generalization of our bursting rules to cis-and trans-perturbations have strong consequences. From our examination, the type of perturbed regulatory mechanism can hardly be linked to changes in a specific bursting parameter and vice versa. Indeed, as is the case for wild-type, P ON emerges as the main governing parameter of the transcriptional activity also for the mutants, while K and T C remain largely unchanged. P ON changes upon a perturbation (i.e., wild-type to mutant) are sufficient to determine the corresponding changes in R, T ON , and T OFF . Moreover, the functional form of the relationships implies that the P ON regime (low versus high) is crucial in determining which parameter is predominantly affected. Different regulatory mechanisms have a different propensity for P ON modulation. Yet, the single-parameter regulation set in place by the identified bursting rules points towards general mechanisms conserving K and T C , and linking T ON and T OFF .

DISCUSSION
While it is appreciated that mRNA production likely occurs in bursts across various systems, quantitatively measuring endogenous, single allele transcriptional bursting in real-time, across a wide range of gene activities, still poses a challenge. This hinders our understanding of how the kinetic parameters of bursting underlie transcriptional dynamics across genes, space, and developmental time. In this study, we devised an approach to perform such measurements in the context of the developing early Drosophila embryo, a system that relies on large changes in transcription rates, as a means to govern protein abundance (Fig. S4A). The spatiotemporal transcriptional activity of the examined genes is regulated by a myriad of processes (e.g. repressor and activator binding, chromatin accessibly, PIC formation and pause-release, histone modifications, etc.) [42,[46][47][48]. These processes are mediated by gene-specific cisarchitectures, with distinct combinations of enhancers and promoters, further differing in their internal motif compositions. Surprisingly, despite the complexity of the regulatory processes involved and the differences between the genes, we find highly restricted, unifying properties of the underlying bursting dynamics.
We observed that the mean transcription rate is governed by a single tunable control parameter, the ONprobability (P ON ), with a near-constant Pol II initiation rate (K). We further found a conserved time scale (T C ) of about a minute, over which the allele states, either active (ON) or inactive (OFF), remain correlated. This time constant explicitly links the mean durations of ON and OFF periods that are thus largely determined by the control parameter (P ON ). P ON , together with the mostly conserved K and T C , fully parameterize the observed bursting dynamics, across genes, space, and developmental time.
While consistent with predictions from our previous measurements on fixed tissues (Fig. S12A) [25], the presented live measurements allow us to go beyond a modelbased inference of kinetic parameters from a static snapshot of distributions of nascent transcripts. Our current approach measures the dynamics directly rather than inferring the kinetics indirectly. Therefore, we are relieved from the constraints imposed by a specific mechanistic model and can thus further relax the steady state assumption. Both aspects are often used in the analyses of fixed and live data [3,6,25,49]. Moreover, access to the full dynamics of individual transcription time series (throughout more than 1 hour of development) allows us to estimate time-dependent transcriptional parameters from the underlying bursts, rather than obtaining only a single transcriptional parameter value per condition.
Using synthetic data, we verify the capacity of this procedure to reliably recover a wide range of bursting parameters, including the estimation of T C (Fig. S5C-D). The conserved nature of T C and its value (i.e., 1.25±0.35 min as estimated from individual bursts) quantitatively agree with the auto-correlation analysis (τ AC = 1.37 ± 0.31 min), an orthogonal approach not involving burst calling. It is intriguing to consider the functional consequences of a constant correlation time T C and of the relatively small measured value. T C not only sets the time scale of the bursting dynamics, linking the mean ON and OFF times, but it has further implications on transcription noise filtering: a small T C value minimizes noise as bursts are easily buffered by long mRNA lifetimes. A small T C further allows gene transcription to respond more rapidly to input TF changes, by means of facilitating a fast relaxation to a steady state (Fig. S7J-K).
The identified bursting rules point to a surprising predictive power of P ON on the mean ON and OFF times (Fig. 5F). This observation is only possible because our system allows for the quantification of bursting parameters across a large dynamic range of transcriptional activity, with P ON values ranging from fully off (0) to fully on (1) for most of our examined genes and conditions. This leads us to uncover a strict relationship between T OFF and T ON (Fig. 6E and Fig. 7A). It is commonly thought that distinct regulatory processes will alter transcriptional activity by predominant regulation of specific bursting parameters (e.g., the mean ON durations of a burst versus the mean OFF intervals in between bursts). Yet, when we perform a cis-or trans-perturbation, which substantially alters transcriptional activity, the predominantly modulated bursting parameters (T OFF versus T ON or burst frequency versus burst size) can be predicted by the wild-type and mutant P ON regimes, rather than by the type of the perturbation performed (Fig. 6E-G and

S11).
To further examine the generality of these results, we investigated previously published transcription measurements in the early fly embryo. These measurements include both endogenous genes and synthetic reporters, where transcription was altered by varying BMP signal (a dorsoventral morphogen) [28] or core promoter composition [20]. Strikingly, we find that these datasets collapse on our established bursting rules (Fig. 7A). As suggested in these studies, the first dataset shows mainly effects on T OFF , while the latter principally changes T ON . Intriguingly, the two independent datasets cluster in disjoint halves of the full spectrum of P ON values captured by our measurements. Our analysis thus raises the possi-bility that the predominantly changed parameter (T OFF versus T ON ) might not be inherent to the examined regulatory manipulation (e.g., input TF concentrations or core promoter elements), but rather a consequence of the limited expression range of these genes.
The striking manifestation of general rules underlying bursting parameters in the Drosophila embryo, as well as the conserved nature of regulatory processes and the transcription machinery across eukaryotes, naturally leads to the question of whether our rules apply more broadly [50]. Recent measurements of an extensively perturbed yeast gene [21] provide an equivalent set of bursting parameters that faithfully adhere to our transcription rules (Fig. 7A). Although, the yeast gene initiation rate K appears mostly constant across conditions (similar to the fly genes), calibration of its value was only possible under specific hypotheses leading to a lower bound that needs to be further tested (Methods). Earlier yeast gene data have shown a significantly smaller initiation rate [5,38]. A comprehensive study allowing multiple gene classes across multiple organisms will be necessary to verify whether our rules hold more generally in eukaryotes.
Mammalian genes are often lowly transcribed [6,29,51], potentially exploring a different parameter regime. Indeed, while bursting parameters inferred from measurements of a luminescent reporter protein [6] are restricted to very small P ON values (0 to 0.05), they appear to be consistent with our established relationships (Fig. 7A). Again, only the initiation rate K seems to deviate. Possible deviations in K across species leave open the possibility that the overall levels of K might be linked to species-specific rates, such as those linked to metabolism [52,53]. Mammalian time-lapse data with a broader P ON range will be necessary to make a stronger parameter comparison possible.
Revisiting previously performed genome-wide studies in mammalian systems shows trends that are possibly compatible with our established bursting relationships. Similar to our fly genes (Fig. 7A), one study found that while burst frequency is predominantly modulated for low expressing genes, burst size is tuned for high expressing ones, independent on the reporter control sequences [13]. Another study using single-cell RNA-seq found a functional dependence of burst frequency and burst size on mean expression that seems compatible with our established rules (for P ON < 0.5) [51]. However, scRNAseq parameters are significantly influenced by mRNA loss and long mRNA lifetimes, making the mapping of the sets of units between the vastly different approaches challenging.
The potentially wider applicability of our bursting relationships to other species calls for a new framework underlying the regulatory processes governing transcription. Instead of specific regulatory processes being inherently linked to specific bursting parameters, the tuning of transcription seems to be funneled through the sole control of P ON , that all regulatory processes act on (Fig. 7B). Future investigations will have to determine the molecular mechanisms that can implement such a funneling at the level of P ON control. How do diverse processes tune P ON ? How is the constancy of the switching correlation time T C implemented molecularly? As for the latter, our work suggests that a highly conserved and general mechanism across eukaryotic genes should be at work. The generality of such a mechanism implies independence from the particularity of a given gene locus and thus could rather be implemented by the environment, including structural consideration of the nuclear architecture [54][55][56] or the molecular assembly of components of the transcription machinery [57,58].
Despite the complexity of transcriptional dynamics across species, genes, space, developmental time, and perturbations, our quantitative real-time measurements revealed strict bursting rules, that set strong constraints on mechanistic models of transcriptional regulation. Our work also has some indicators for the generality of these rules across systems, their functional implications, and their molecular underpinning. As is the case with other areas in which organizing principles are increasingly emerging [59][60][61][62][63], these rules offer new ways to think about complex processes and point to conserved mechanisms at their core. In absence of imaging error, the transcriptional activity in the green and red channels when calibrated to C.U. should perfectly correlate (on the diagonal). We fitted the spread σimg orthogonal to the diagonal (black line, slope one) to characterize the imaging error; assuming σ 2 img scales as σ 2 b + αI with mean intensity I, where σ 2 b is the background noise and αI a Poisson shot noise term. The resulting fit for σimg is highlighted by the dashed lines (plus minus one std around the diagonal). (Right) Imaging error estimation from the single allele transcriptional time series (with the assumption that the measured transcriptional fluctuations result from the sum of uncorrelated imaging noise and correlated noise due to the elongation of tagged nascent transcripts). We computed the time-dependent mean activity µ(t), variance σ 2 (t) and covariance between consecutive time points Cov(t, t + ∆t) (where ∆t is 10 s), over all nuclei within 1.5-2.5% AP bins for all measured genes. The uncorrelated imaging variability σ 2 img is then approximated by σ 2 img − Cov(t, t + ∆t), which is plotted as a function of µ(t) for all time points. We characterized σ 2 img by fitting the data with a line σ 2 b + αµ. Fitting results are shown in E. (E) Our two estimates for the imaging error (interlaced dual-color construct (solid line) and correlation-based approach (dashed line)) are consistent. The signal-to-noise-ratio (SNR), defined as µ/σimg, is close to one (dotted line) when µ ≈ 1, indicating that the sensitivity of our live measurements is close to one mRNA molecule. (F) Fractional embryo variability profiles as a function of AP position and developmental time, for all gap genes. We define embryo variability σ 2 emb as the variance of the mean activity across embryos, and report the fractional embryo variability as the ratio σ 2 emb /σ 2 , where σ 2 = σ 2 emb + σ 2 nuc is the total variance, and σ 2 nuc corresponds to the transcriptional allele-to-allele noise across nuclei. Overall, the fractional embryo variability is ∼ 10%, meaning that most of the variability arises from σ 2 nuc . Thus, together D, E, and F show that σ 2 tot = σ 2 img + σ emb + σ 2 nuc is a good proxy for σ 2 nuc , which is the relevant noise contribution that contains all the bursting phenomenology.   In both cases, the two cassettes were labelled using two different colors (MCP-GFP green and PCP-mCherry red). Since the two signals are correlated through the elongation process, the simultaneously measured pair of time series has a further constrained set of underlying initiation configurations and represents thus a good test for the approach. To deconvolve single allele dual color time series together (i.e., a single train of polymerases needs to match two signals), using two kernels modeling each loop-cassette location and satisfying our key assumptions (i. constant and deterministic elongation rate; ii. no Pol II pausing/dropping in gene body; iii. fast termination). In addition, the dual-color strategy allows estimation of the average elongation rate from the overall delay between the two signals (using the known genomic distance between the MS2 and PP7 insertion sites). (B) Dual-color signal reconstructed from deconvolved single allele transcription time series (black lines for raw measured data). single allele transcription rate (gray line with one std envelope) is deconvolved from the single depicted pair of measured time series (black lines). The signal (red and green lines with one std envelope) is devoid of imaging noise (as it was modeled from Fig. S1D during the deconvolution process) and is reconstructed by convolving back the resulting transcription rate with the kernel of each channel. Qualitatively, the signal (color) matches well (see C) the measured time series (black) in strong support of our kernel assumptions. (C) Distribution of residuals from the dual-color reconstruction. We quantified the mean and standard deviation of the normalized residuals, i.e., of the difference between the measured signal (black in H) and the reconstructed signal Both the colored and dashed profiles agree, justifying our deconvolution approach. Error bars are one standard deviation across embryo means. Overall, we have effectively deconvolved Ng = 7 "genes" (4+gt male and female and anterior and posterior regions), over Nt = 362 time points (NC13+NC14), across Nx =9-18 positions, leading to a total of 33 214 bins, each averaging ∼ 200 nuclei (with a single allele per nucleus). (B) Fraction of spatial and temporal bins whose single allele transcription rate distribution P (r) is consistent with the conditional transcription rate distribution P (r|R) determined by pooling nuclei over multiple bins at a given mean transcription rate R. We computed the 95% confidence interval on the cumulative distribution of P (r|R) and checked for all the underlying bins at a given R whether their individual cumulative distribution was within the overall confidence interval. We repeated this process for four distinct developmental time windows: NC13 (6.5 ≤ t min after mitosis) plus early NC14 (7.5 ≤ t < 20.5 min), mid NC14 (20.5 ≤ t < 34.5 min), late NC14 (34.5 ≤ t < 48 min), and a wider NC14 window (7.5 ≤ t < 48 min). Overall, bins that share similar R within the same time window have very similar P (r) distribution (median given by dashed line over 80%), which justifies the pooling of these bins. On the other hand, when pooling bins over the whole NC14 we observe further dissimilarities between bins, suggesting that P (r|R) might moderately change over time. (C) 3 rd cumulant and 4 th cumulant of single allele transcription rate as a function of mean transcription rate R in NC13 (square) and early NC14 (7.5 ≤ t < 20.5 min; circle). The single allele transcription rates are estimated within 1-min-intervals, highlighting a strong departure of the variance from a constitutive Poisson regime (dashed line, κ3 and κ4 = R). All gap genes follow the same trend suggesting a common bursting regime. (D) Interpreting auto-correlation function using the 2-state model of transcriptional bursting. In this model for a single gene copy (top), the gene promoter switches stochastically between an OFF and ON state with rates kOFF and kON, where in the latter Pol II can be loaded at rate K (1) and elongate at rate K elo . Of note, the superscript (1) specifies the parameter for a single gene copy. The auto-correlation functions displayed here are computed from the model using a switching correlation time T (1) C = 1/(kOFF + kON) = 2 min, a Pol II elongation time τ elo = Lg/K elo = 2 min (where Lg is the gene length) and an initiation rate K (1) = 8 mRNA/min; the steady state ON-probability P (1) ON = kON/(kON + kOFF) varies from 0 to 1 (i.e., blue to red color code, fraction of nuclei in ON state or fraction of time a nucleus is in ON state). In principle, promoter switching (generating bursts) leads to temporal correlations in the transcriptional activity time series (Activity AC). However, from the raw live measurements, these correlations are hard to distinguish from the ones introduced by elongation (left), specifically when the switching correlation time T (1) C is close to or smaller than the elongation time τ elo . Instead, performing auto-correlation analysis on deconvolved single allele transcription rates resolves the switching correlations (Transcription AC, right), since correlations due to elongation have been removed. Thus, the switching correlation time T  (1) C from 0.5 to 10 min), we performed single allele deconvolution and computed the auto-correlation on the resulting transcription rates. We estimated the correlation time T (1) C and the magnitude ΣAC by fitting exponential. Both parameters are properly retrieved with minimal biases. Color code stand for P (1) ON and dashed line for slope 1. (G) Estimating deconvolution biases due to elongation rate measurement bias. As in F, we generated simulated data (P ON from 0.03 to 0.9 and T (1) C = 2 min) and aimed to deconvolve the data with elongation rate higher (orange dot) or lower (yellow dot) than the correct value (blue dot, used to generate the data as in F). Overall, the estimated parameters are estimated correctly, with larger biases at large P  A simple modeling attempt for protein accumulation from mean transcription rate measurements. The mean transcription rate (left column) across space and time is estimated by normalizing the measured mean activity by the elongation time and applying a minor correction for the delay (< 1 min) resulting from the loop insertion location. Horizontal white dashed lines correspond to the transition (mitosis) from NC13 to NC14. Protein accumulation (middle column) is computed from the mean transcription rate as the convolution of the latter with a kernel modeling protein decay, diffusion, and delay due to mRNA export, translation, and nuclear import. This simple model introduces three free parameters, a protein lifetime, a diffusion constant, and a time-delay (see B). These three parameters were set by minimizing the mean squared error with previously measured protein patterns (right column; Dubuis et al., 2013). Small residual deviations between middle and right columns might be due to post-transcriptional regulatory processes that our simple model does not account for. (B) Parameters estimated for the modeled accumulation of effective proteins as described in A. The three parameters were either estimated for each gene separately (color bars) or all genes together (dashed lines, used for middle column in A). Overall, the effective parameters are mostly in line with previous estimates [64].
ON ), we generated simulated data (200 alleles and 50 min long time series at 10 s intervals and with the same imaging noise as measured in real data). Using the two-state model (Fig. S3D) and the Gillespie algorithm, we generated time series for each individual sister chromatid. We summed the initiation events of both chromatids assuming independence. The resulting initiation events were convolved with the elongation kernel to generate synthetic single allele signal data over which measurement noise was added. We performed single allele deconvolution and burst calling as described (Fig. 3A) ON ) and corresponds to the median effective parameter and the error bars to the 68% confidence interval estimated over 50 min. Our deconvolution and burst calling approach leads to an excellent estimation of the effective parameters over a large range of T  C approaches 0.5 min (blue dots). Importantly, biases for the effective switching correlation time TC are small, supporting our ability to detect its constancy in real data. (D) Summary of median relative error for each effective parameter estimated from the data in C as a function of input T (1) C . Parameter estimated from real data (Fig. S9B) suggests that T (1) C lies within 1 and 3 min (black border). (E) Summary of median relative error for each effective parameter estimated from the whole data set in C as a function of the burst calling parameters. Burst calling depends on two free parameters: the time window W over which the rate is estimated, and the rate threshold R d applied to call the burst (see Fig. 3A). Our default parameter values are W = 1 and R d = 2, which should be close to optimal given the estimated correlation time of τAC ∼ 1 min (Fig. 2F) and our measurement sensitivity of 1-2 mRNA. When testing the effect of different W and R d values on the median relative error, our default choice leads to the lowest global relative error.  ON . The nonlinearity is mostly explained by temporal changes in K (1) , as highlighted by dotted lines whose slopes are the K (1) values from F and G. (I) Under the two independent sister chromatids assumption, the effective initiation rate K depends on PON and on K (1) , which varies as a function of time (see F). As PON increases, the propensity to observe two gene copies initiating transcription at the same time increases, which explains up to a factor of two in the dependence of K on PON. Indeed, the dotted lines correspond to the predicted behavior using the same three constant values of K (1) as in F. In addition, K (1) varies by up to 38% along time during NC14, as can be seen in F. Together, it explains close to a factor of 3. for all gap genes estimated within three time windows, corresponding to NC13 (6.5 ≤ t min) plus early NC14 (7.5 ≤ t < 20.5 min), mid NC14 (20.5 ≤ t < 34.5 min), and late NC14 (34.5 ≤ t < 48 min). The observed bursting relationships are further refined when accounting for possible temporal changes (color scatter) compared to all time-pooling (gray scatter, Fig. 5F). Indeed, small changes in K and TC over developmental time (∼ 40% decrease) explain part of the observed spread in Fig. 5F. (B) As in A, but for single gene copy parameters computed from the effective ones assuming independent sister chromatids. Interestingly, the relationship between sgc transcription rate R (1) and sgc ON-probability P ON is almost linear, confirming that the sgc initiation rate K (1) does not depend strongly on P ( ON (1)). Thus, the apparent dependence of K (1) on P ( ON (1)) is only effective and results from measuring two sister chromatids (two gene copies) together, instead of an isolated single gene copy. ). The solid black lines delimit the region, where changes in TOFF and TON are not significant given the "thickness" of our relationships (95% confidence intervals, see Methods). Thus, this procedure defined a look-up table enabling prediction of the type of modulation using pairs of PON. (B) Scatter plot of all the PON pairs from hb wt and cis-mutant (at same spatiotemporal location). Colors correspond to the predicted modulation (TOFF dominated in gray and TON dominated in blue) using the look-up table in A. (C) Verification of predicted modulation in B (color code as in B). For each PON pair, we computed the TOFF (T mut OFF /T wt OFF ) and TON (T mut ON /T wt ON ) fold change using the estimated TOFF and TON from data (Fig. 6C). Supporting our ability to predict the modulation, almost all the blue data points (predicted as TON modulation) are located above the slope 1 diagonal (dashed line), whereas most of the gray ones (predicted as TOFF modulation) are below. Thus, for most data points (> 85%) the prediction is correct (Fig. 6F). (D-F) Initiation rate K, burst size B, and burst frequency F for hb in cis-mutation (cyan), kni in cis-mutation (olive) and for kni in trans-mutation (light green) collapse on corresponding wild-type parameters (gray). Solid black lines correspond to the endogenous bursting rules from Fig. 5F and S8E. (G-I) Predicted F versus B modulation for mutant based on wild-type-derived rules ( Figure S8E, black lines). (G) F and B are first approximated as function of PON using the wild-type rules. The predicted fold change in F (F mut /F wt ) and B (B mut /B wt ) are then computed for all possible pairs of PON (i.e. P wt ON and P mut ON ). The dotted line delimits the regions where changes in transcription rate are either dominated by changes in F (gray region, | log (F mut /F wt )| > | log (B mut /B wt )|)) or B (blue region, | log (F mut /F wt )| < | log (B mut /B wt )|)). As for TOFF and TON (in A), the solid black lines delimit the region of significance and we have thus defined a look-up table enabling prediction of F vs B modulation using pairs of PON. (H) Scatter plot of all the PON pairs from hb wt and cis-mutant (at same spatiotemporal location). Colors correspond to the predicted modulation (F dominated in gray and B dominated in blue) using the look-up table in G. (I) Verification of predicted modulation in H (color code as in H). For each PON pair, we computed the F (F mut /F wt ) and B (B mut /B wt ) fold change using the estimated F and B from data (E and F). Supporting our ability to predict the modulation, almost all the blue data points (predicted as B modulation) are located above the slope 1 diagonal (dashed line), whereas most of the gray ones (predicted as F modulation) are below. Thus, for most data points (> 95%) the prediction is correct ( Figure 6G).